10042865 Results

SEQ ID NO: 28

Result            Query
   No.    Score   Match  Length  DB  ID        Description

     1     2824   100.0     533   5  ABB98414  Abb98414 Human NOV
     2     1794    63.5     529   5  AAO22794  Aao22794 Protein o
     3     1786    63.2     529   5  AAE28617  Aae28617 Human UGT
     4     1786    63.2     529   6  ADA11076  Ada11076 Human cDN
     5     1784    63.2     533   4  ABG05523  Abg05523 Novel hum
     6   1771.5    62.7     528   3  AAY78933  Aay78933 Human UDP
     7     1770    62.7     524   3  AAY78934  Aay78934 Human UDP
     8   1753.5    62.1     540   4  ABG05525  Abg05525 Novel hum
     9   1751.5    62.0     530   7  ADC39065  Adc39065 Novel hum
    10   1741.5    61.7     530   2  AAW47126  Aaw47126 Uridine d
    11   1741.5    61.7     530   7  ADE58009  Ade58009 Human Pro
    12   1732.5    61.3     530   3  AAY78935  Aay78935 Human UDP
    13   1732.5    61.3     530   6  ABJ19806  Abj19806 Androgen-
    14     1716    60.8     529   4  AAE02188  Aae02188 Human bre
    15     1709    60.5     532   4  ABG05524  Abg05524 Novel hum
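
The Query Match column appears to be each hit's score expressed as a percentage of the score of the 100.0% hit (2824 for SEQ ID NO: 28). The sketch below illustrates that reading only; it is not the search program's own code, and `query_match_percent` and `SELF_SCORE` are names chosen here for illustration.

```python
def query_match_percent(score: float, self_score: float) -> float:
    """Return score as a percentage of the self-match score, to one decimal."""
    return round(100.0 * score / self_score, 1)


# Assumption: the 100.0% row of the SEQ ID NO: 28 summary is the self-match.
SELF_SCORE = 2824.0

# Scores taken from the summary table above.
for score in (1794.0, 1771.5, 1709.0):
    print(score, query_match_percent(score, SELF_SCORE))
```

Under this assumption the computed percentages agree with the Query Match values listed for the same scores.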



RESULT 6
AAY78933
ID   AAY78933 standard; protein; 528 AA.
XX
AC   AAY78933;
XX
DT   05-JUN-2000 (first entry)
XX
DE   Human UDP-glucuronosyltransferase 2B4 amino acid sequence.
XX
KW   UDP-glucuronosyltransferase 2B4; UGT2B4; polymorphism; metabolism; SNPs;
KW   drug interaction; detect; human; single nucleotide polymorphism.
XX
OS   Homo sapiens.
XX
PN   WO200006776-A1.
XX
PD   10-FEB-2000.
XX
PF   22-JUL-1999; 99WO-US016675.
XX
PR   28-JUL-1998; 98US-0094391P.
XX
PA   (AXYS-) AXYS PHARM INC.
XX
PI   Galvin M, Miller A, Penny L, Riedy M;
XX
DR   WPI; 2000-195321/17.
DR   N-PSDB; AAZ95199.
XX
PT   Novel human UDP-glucuronosyltransferase sequence, polymorphisms for
PT   genotyping individuals to predict rate of metabolism of substrates and
PT   for identifying potential drug interactions.
XX
PS   Disclosure; Page 36-37; 72pp; English.
XX
CC   This sequence represents the human UDP-glucuronosyltransferase 2B4
CC   (UGT2B4) amino acid sequence. UDP-glucuronosyltransferase (UGTs) are a
CC   family of enzymes that catalyse the glucuronic acid conjugation of a wide
CC   range of endogenous and exogenous substrates. The UGT2B gene subfamily
CC   encode steroid metabolizing isoforms in the liver. Alteration of the
CC   expression or function of UGTs may effect drug metabolism. The invention
CC   relates to non-chromosomal nucleic acid molecules, which comprise human
CC   UGT2B sequence polymorphisms (see AAZ95051-Z95110). Probes which detect
CC   the UGT2B locus polymorphisms can be used to detect altered UGT2B
CC   metabolism of a substrate in an individual. The nucleic acid molecules
CC   comprising a human UGT2B sequence polymorphism can be used in screening
CC   assays for genotyping individuals, also to predict their rate of
CC   metabolism of UGT2B substrate, potential drug-drug interactions and
CC   adverse side effects. The polymorphisms can be used as single nucleotide
CC   polymorphisms (SNPs) for detecting genetic linkage related to phenotypic
CC   variation in activity or expression of UGT2B protein. The polymorphism
CC   containing nucleic acid molecules may also be used for generating
CC   genetically modified non-human animals and for obtaining site specific
CC   gene modification in cell lines
XX
SQ   Sequence 528 AA;

  Query Match 62.7%; Score 1771.5; DB 3; Length 528;
  Best Local Similarity 66.2%; Pred. No. 1.3e-173;
  Matches 355; Conservative 54; Mismatches 116; Indels 11; Gaps 9;
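
The Best Local Similarity figure is consistent with the number of identical positions divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). The following sketch assumes that definition; the function name `best_local_similarity` is hypothetical and the formula is inferred from the reported counts, not taken from the search program's documentation.

```python
def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identities as a percentage of all alignment columns, to one decimal."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts reported for the AAY78933 alignment.
print(best_local_similarity(355, 54, 116, 11))
```

With the counts reported for AAY78933 the alignment has 536 columns, and 355 identities over 536 columns rounds to the 66.2% shown.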
Qy 1 MAMKWTSVLLLIQLSYYSSSGSCGNVPLWPMEYSPWMNIKTILDKLMQISHEVTVLTLSA 60 

hill! Mill I MINI I :|| hi llllllllhhl 1 1 1 1 1 1 II 

Db 1 MSMKWTSALLLIQLSCYFSSGSCGKVLVWPTEFSHWMNIKTILDELVQRGHEVTVLASSA 60 

Qy 61 SILVDPNITSVTKFEVYSISVIKDDFAGFFFTQQITKWIHDLPKHIFWFKCVPFKNILWE 120 

II III I IMII «|. I • I I = :| :||| II : hi 

Db 61 SISFDPNSPSTLKFEVYPVSLTKTEFED- IIKQLVKRWA-ELPKDTFWSYFSQVQEIMWT 118 

Qy 121 YSGYTEKFFKDWLNKKLMTNLQESRSDWHANAIGPFGELLAELLKISFVYSLHFSPGY 180 

II Ihl Mill Mill III hh Ml Ml hi HIM Mill 

Db 119 FNDILRKFCKDIVSNKKLMKKLQESRFDWLADAVFPFGELLAELLKIPFVYSLRFSPGY 178 

Qy 181 TFEKYSGGFLLPPSYGAVILSELSGSMTFMETVRNIIYVFYFDFWFQTFDMKKGDQFYSE 240 

hi I I MM MMIM llhl hhlll MMIM Mill I II I II 

Db 179 AIEKHSGGLLFPPSYVPWMSELSDQMTFIERVKNMIYVLYFEFWFQIFDMKKWDQFYSE 23 8 

Qy 241 VLGKSCFLSEIMGKAEMWLIRNYWYLEFPRPLLPNFEFWRLYCKPVNPLPKEKMEEFAQ 3 00 

Mh III I 1 1 = = 1 1 1 1 1 1 1 Ml Mill III IMII Mill MM I 

Db 239 VLGRPTTLSETMAKADIWLIRNYWDFQFPHPLLPNVEFVGGLHCKPAKPLPKE-MEEFVQ 297 

Qy 301 SSDEDG-WFSLESAVQNLTEEKADLITSALAQIPQKVM-KF-GRKPNTLRSNTQWHRWI 357 

II hi Mill I I I .||:|::| lllhllllh I I hi Ih "II 
Db 298 SSGENGVWFSLGSMVSNTSEERANVIASALAKIPQKVLWRFDGNKPDTLGLNTRLyKWI 357 

Qy 358 PQNECLILDHPQTKAFITYGGTNSIYEMIYRGVPSMGIPLFADQHDNIAHMKAKGAAVIL 417 

Mh = 1 Ihhlllhll I III II hi MMIIIM 1 1 1 1 1 1 i 1 1 1 1 1 1 I 

Db 35 8 PQND- -LLGHPKTRAFITHGGANGIYEAIYHGIPMVGVPLFADQPDNIAHMKAKGAAVSL 415 

Qy 418 DLSTKSSTDLLDISVFVSLFLSFRYKESVMKLSRIQHDQPVKPLDRAVFWIEFVMRHKGA 4 77 

I I MUM : :: Ih hllll II I II I II M M II II M II II II 

Db 416 DFHTMSSTDLL- -NALKTVINDPLYKENAMKLSRIHHDQPVKPLDRAVFWIEFVMRHKGA 473 

Qy 478 KHLRVAARDLTWFQYHSLDVIGFLLACVATVTFIITKCCLFCFWKFTRKVKKEKRD 533 

MMIM lllllllllll Hllll Mill MM Ml I II III 

Db 474 KHLRVAAHDLTWFQYHSLDVTGFLLACVATVIFIITK-CLFCVWKFVRTGKKGKRD 528 



Issued

SUMMARIES

                    %
Result            Query
   No.    Score   Match  Length  DB  ID                 Description

     1   1771.5    62.7     528   4  US-09-356-806-8    Sequence 8, Appli
     2     1770    62.7     524   4  US-09-356-806-40   Sequence 40, Appl
     3   1741.5    61.7     530   3  US-09-180-852-2    Sequence 2, Appli
     4   1732.5    61.3     530   4  US-09-356-806-113  Sequence 113, App
     5   1375.5    48.7     454   4  US-09-813-918-2    Sequence 2, Appli
     6      998    35.3     288   4  US-09-813-918-3    Sequence 3, Appli
     7      779    27.6     531   5  PCT-US92-00282-6   Sequence 6, Appli
     8    777.5    27.5     534   5  PCT-US92-00282-4   Sequence 4, Appli
     9      777    27.5     533   5  PCT-US92-00282-3   Sequence 3, Appli
    10    737.5    26.1     531   5  PCT-US92-00282-5   Sequence 5, Appli
    11    689.5    24.4     529   5  PCT-US92-00282-7   Sequence 7, Appli
    12    568.5    20.1     245   4  US-09-305-856B-18  Sequence 18, Appl
    13      543    19.2     197   4  US-09-813-918-4    Sequence 4, Appli
    14      311    11.0      98   5  PCT-US92-00282-26  Sequence 26, Appl
    15      281    10.0     515   3  US-08-942-012B-32  Sequence 32, Appl


Result            Query
   No.    Score   Match  Length  DB  ID      Description

     1     1756    62.2     529   6  O97951  O97951 macaca fasc
     2     1745    61.8     529   6  Q9GLD9  Q9gld9 macaca mula
     3     1734    61.4     529   6  Q9TSL6  Q9tsl6 macaca fasc
     4     1730    61.3     529   6  Q9GLE0  Q9gle0 macaca mula
     5   1700.5    60.2     528   6  Q8WN97  Q8wn97 macaca fasc
     6   1554.5    55.0     529  11  Q8R084  Q8r084 mus musculu
     7   1554.5    55.0     532  11  Q8K154  Q8k154 mus musculu
     8   1545.5    54.7     528  11  Q8VIF9  Q8vif9 cavia porce
     9     1481    52.4     529  11  Q8VIF8  Q8vif8 cavia porce
    10     1473    52.2     529  11  Q8BJL9  Q8bjl9 mus musculu
    11   1465.5    51.9     530  11  Q8K169  Q8k169 mus musculu



Result            Query
   No.    Score   Match  Length  DB  ID          Description

     1     1786    63.2     529   1  UDB7_HUMAN  P16662 homo sapien
     2   1771.5    62.7     528   1  UDB4_HUMAN  P06133 homo sapien
     3   1741.5    61.7     530   1  UDBH_HUMAN  O75795 homo sapien
     4     1741    61.7     529   1  UDB9_MACFA  O02663 macaca fasc
     5   1732.5    61.3     530   1  UDBF_HUMAN  P54855 homo sapien
     6   1723.5    61.0     528   1  UDBA_HUMAN  P36537 homo sapien
     7   1721.5    61.0     528   1  UDBJ_MACFA  Q9xt55 macaca fasc
     8     1716    60.8     529   1  UDBB_HUMAN  O75310 homo sapien
     9     1683    59.6     529   1  UDBS_HUMAN  Q9by64 homo sapien
    10   1677.5    59.4     530   1  UDBK_MACFA  O77649 macaca fasc
    11   1563.5    55.4     523   1  UDBG_RABIT  O19103 oryctolagus
    12     1553    55.0     531   1  UDBD_RABIT  P36512 oryctolagus
    13   1545.5    54.7     529   1  UDB1_RAT    P09875 rattus norv
    14   1540.5    54.6     530   1  UDBE_RABIT  P36513 oryctolagus
    15   1508.5    53.4     530   1  UDBC_RAT    P36511 rattus norv
    16   1461.5    51.8     530   1  UDB5_MOUSE  P17717 mus musculu
    17   1445.5    51.2     530   1  UDB2_RAT    P08541 rattus norv
    18   1437.5    50.9     530   1  UDB3_RAT    P08542 rattus norv
    19   1413.5    50.1     530   1  UDB6_RAT    P19488 rattus norv



SEQ ID NO: 27

Result            Query
   No.    Score   Match  Length  DB  ID         Description

     1     1606   100.0    1606   6  AX675577   AX675577 Sequence
     2     1606   100.0    1606   6  AX921811   AX921811 Sequence
     3      973    60.6    1855   6  AX336329   AX336329 Sequence
     4      973    60.6    1855   6  AX336696   AX336696 Sequence
     5      973    60.6    1855   6  AX409473   AX409473 Sequence
     6      973    60.6    1855   9  HUMUDPGTA  J05428 Human 3,4-c
     7    966.6    60.2    1766   9  BC030974   BC030974 Homo sapi
     8    966.6    60.2    1854   6  BD229166   BD229166 Genotype
     9    966.6    60.2    1854   6  AR349418   AR349418 Sequence
    10    962.4    59.9    1639   6  AX548042   AX548042 Sequence
    11    958.6    59.7    2107   6  AR168316   AR168316 Sequence
    12    958.6    59.7    2107   9  HSU59209   U59209 Homo sapien
    13    953.8    59.4    1753   9  AF016310   AF016310 Macaca fa
    14    950.6    59.2    1976   6  BD229238   BD229238 Genotype
    15    950.6    59.2    1976   6  AR349490   AR349490 Sequence



RESULT 6
HUMUDPGTA
LOCUS       HUMUDPGTA    1855 bp    mRNA    linear   PRI 03-AUG-1993
DEFINITION  Human 3,4-catechol estrogen UDP-glucuronosyltransferase mRNA,
            complete cds.
ACCESSION   J05428
VERSION     J05428.1  GI:340079
KEYWORDS    3,4-catechol estrogen UDP-glucuronosyltransferase.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 1855)
  AUTHORS   Ritter,J.K., Sheen,Y.Y. and Owens,I.S.
  TITLE     Cloning and expression of human liver UDP-glucuronosyltransferase
            in COS-1 cells. 3,4-catechol estrogens and estriol as primary
            substrates
  JOURNAL   J. Biol. Chem. 265 (14), 7900-7906 (1990)
  MEDLINE   90243659
   PUBMED   2159463
COMMENT     Original source text: Human liver, cDNA to mRNA, clone 63-11.
            Draft entry and computer-readable sequence for [1] kindly submitted
            by I.S.Owens, 22-FEB-1990.
FEATURES             Location/Qualifiers
     source          1..1855
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
     CDS             15..1604
                     /note="UDP-glucuronosyltransferase (EC 2.4.1.17)"
                     /codon_start=1
                     /protein_id="AAA36793.1"
                     /db_xref="GI:340080"
                     /translation="MSVKWTSVILLIQLSFCFSSGNCGKVLVWAAEYSHWMNIKTILD
                     ELIQRGHEVTVLASSASILFDPNNSSALKIEIYPTSLTKTELENFIMQQIKRWSDLPK
                     DTFWLYFSQVQEIMSIFGDITRKFCKDVVSNKKFMKKVQESRFDVIFADAIFPCSELL
                     AELFNIPFVYSLSFSPGYTFEKHSGGFIFPPSYVPVVMSELTDQMTFMERVKNMIYVL
                     YFDFWFEIFDMKKWDQFYSEVLGRPTTLSETMGKADVWLIRNSWNFQFPHPLLPNVDF
                     VGGLHCKPAKPLPKEMEDFVQSSGENGVVVFSLGSMVSNMTEERANVIASALAQIPQK
                     VLWRFDGNKPDTLGLNTRLYKWIPQNDLLGHPKTRAFITHGGANGIYEAIYHGIPMVG
                     IPLFADQPDNIAHMKARGAAVRVDFNTMSSTDLLNALKRVINDPSYKENVMKLSRIQH
                     DQPVKPLDRAVFWIEFVMRHKGAKHLRVAAHDLTWFQYHSLDVIGFLLVCVATVIFIV
                     TKCCLFCFWKFARKAKKGKND"
ORIGIN

  Query Match 60.6%; Score 973; DB 9; Length 1855;
  Best Local Similarity 78.3%; Pred. No. 1e-212;
  Matches 1265; Conservative 0; Mismatches 320; Indels 30; Gaps 7;
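
For a nucleotide alignment with no conservative class (Conservative 0), the match line between the Qy and Db rows is fully determined by the two sequences, so it can be recomputed rather than read from the printout. The sketch below does this for the first 60 columns of the alignment (Qy 1-60 against Db 15-74); the function name `midline` and the use of `-` as the gap character are assumptions made for illustration.

```python
def midline(qy: str, db: str, gap: str = "-") -> str:
    """Return '|' where both rows carry the same base, ' ' elsewhere."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same number of columns")
    return "".join("|" if a == b and a != gap else " " for a, b in zip(qy, db))


# First alignment block: Qy 1-60 and Db 15-74.
QY = "ATGGCTATGAAATGGACTTCAGTCCTTCTGTTGATACAGCTGAGCTATTACTCTAGCTCT"
DB = "ATGTCTGTGAAATGGACTTCAGTAATTTTGCTAATACAACTGAGCTTTTGCTTTAGCTCT"

line = midline(QY, DB)
print(QY)
print(line)
print(DB)
print("identities in block:", line.count("|"))
```

Summing the identities over every block in the same way should reproduce the Matches total, provided the rows are free of transcription errors.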



Qy 1 ATGGCTATGAAATGGACTTCAGTCCTTCTGTTGATACAGCTGAGCTATTACTCTAGCTCT 60 

HI M llllllllllllllll II II I Mill lllllll II II I Ml, 

Db 15 ATGTCTGTGAAATGGACTTCAGTAATTTTGCTAATACAACTGAGCTTTTGCTTTAGCTCT 74 

Qy 61 GGG AGTTGTGG AAATGTG CCG C TGTGG C CCATGG AAT ATAGT CCTTGGATG AAT AT AAAG 12 0 

„ I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I M 

Db 7 5 GGG AAT TG TGGAAAGGTG CTGGTGTGGGC AG CAG AAT ACAGC C ATTGGATG AAT AT AAAG 134 

QY 121 ACAATCCTGGATAAACTTATGCAGATAAGTCATGAGGTGACTGTTCTAACATTGTCAGCT 180 

„ I Ml II II I III I Mill Mil I llllllllllllllll II III MUM 

Db 135 ACAATCCTGGATGAGCTTATTCAGAGAGGTCATGAGGTGACTGTACTGGCATCTTCAGCT 194 

Qy 181 TCCATTCTTGTTGATCCCAACATAACATCTGTTACTAAATTTGAGGTTTATTCTATATCT 24 0 

1QC 1 1 1 1 1 J [ 1 1 M M M I II II I MM II MM Mil Mill I I MM 

Db 195 TCCATTCTTTTTGATCCCAACAACTCATCCGCTCTTAAAATTGAAATTTATCCCACATCT 2 54 

Qy 241 GTAATTAAAGATGATTTTGCAGGGTTTTTTTTCACACAACAGATTACTAAATGGATACAT 300 

occ HI MM Ml M M III Ml IMMIIIII Mill I 

Db 255 TTAACTAAAACTGAGTT GGAGAATTTCATCATGCAACAGATTAAGAGATGGTCA 308 

Qy 301 GATCTTCCAAAACATATATTTTGGTTTAAATGTGTTCCCTTCAAGAATATTCTTTGGGAA 360 

M MM Ml II II M M I I I I I | Ml M M I 



Db 


309 GACCTTCCAAAAGATACATTTTGGTTATATTTTTCACAAGTACAGGAAATCATGTCAATA 368 


Qy 


361 TATTCTGGTTATACTGAGAAGTTCTTTAAAGATGTAGTTTTGAACAAGAAACTTATGACA 420 

i i ii iii i i i i i i i i i i i i i i i i i i i i i ii i i i i i i i i i i i i i 


Db 


! i II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 I M 
369 TTTGGTGACATAACTAGAAAGTTCTGTAAAGATGTAGTTTCAAATAAGAAATTTATGAAA 4 28 


Qy 


421 AACCTACAAGAATCAAGGTCTGATGTCGTTCATGCAAATGCCATTGGTCCCTTTGGAGAG 4 80 


Db 


II M ,11 Mill 1 III III II Mil MM III Mill 1 1 III 

429 AAAGTACAAGAGTCAAGATTTGACGTCATTTTTGCAGATGCTATTTTTCCCTGTAGTGAG 4 88 


Qy 


481 CTGCTGGCTGAGCTATTAAAAATATCCTTTGTGTACAGTCTCCACTTCTCTCCTGGCTAC 540 

i i i i i i i i i i i i i i i i i ii iii i i i i i i i i i i i i i i i i i i i i i i i i i i i i 


Db 


M M 1 1 1 1 1 1 1 II II 1 1 II Ml II 1 II II II II II 1 1 1 1 II II II II II 1 1 II M 

4 8 9 CTGCTGGCTGAGCTATTTAACATACCCTTTGTGTACAGTCTCAGCTTCTCTCCTGGCTAC 54 8 


Qy 


541 AC ATTTG AGAAAT ACAGTGG AGG ATTTCTACTT C CAC CTT CCTATGG AG C TGTTATT CTG 600 

M Mill II 1 II .IN 1 1 M II II M II 1 1 Mill II M 

54 9 ACTTTTGAAAAGCATAGTGGAGGATTTATTTTCCCTCCTTCCTACGTACCTGTTGTTATG 608 


Db 


Qy 


601 T CAGAATTAAGTGGTTCGATG ACATT CATGG AG AC AGT AAGAAATATTATAT ATG TGTTT 660 

I I I I I i I I 1 I ii i i i i i i i i i i i i i i i i i i i i i i i i i i ii i i i i i i ii 


Db 


M 1 1 II M II II Mill 1 II II 1 M II MM II 1 II 1 M 1 II 1 II II 

609 TCAGAATTAACTGATCAAATGACTTTCATGGAGAGGGTAAAAAATATGATCTATGTGCTT 668 


Qy 


661 TATTTTGACTTTTGGTTCCAAACATTTGATATGAAGAAGGGAGACCAGTTTTACAGTGAA 720 

II 1 1 1 1 1 M II 1 II II 1 III MUM hhllll, 1 II IIIIMM llllll 

669 TACTTTGACTTTTGGTTCGAAATATTTGACATGAAGAAGTGGGATCAGTTTTATAGTGAA 728 


Db 


Qy 


721 GTTCTAGGTAAGTCATGTTTTTTATCTGAGATAATGGGAAAAGCTGAAATGTGGCTCATT 780 
I I I I I I I I I i i I i i i i i i i i i iiiiii i i i i i i i i i i i i i i iii 


Db 


1 M II 1 II 1 1 1 M II II II II II 1 1 II M II II II 1 II II 1 1 II 

729 GTTCTAGGAAGACCCACTACGTTATCTGAGACAATGGGGAAAGCTGACGTATGGCTTATT 788 


Qy 


781 CGAAACTACTGGTATTTGGAATTTCCTCGCCCACTCTTACCTAATTTTGAATTTGTTGTA 84 0 

1 1 1 1 1 1 I I 1 I l I I l 1 I i i i i i i i i i i i i i i i i i i iii i i i i i i i i i < < i 

M 1 II M II II II II 1 II II II 1 II II M II 1 II Ml MM M II 1 II 1 

789 CGAAACTCCTGGAATTTTCAGTTTCCTCATCCACTCTTACCAAATGTTGATTTTGTTGGA 848 


Db 


Qy 


841 AGACTCTACTGCAAACCTGTCAACCCCCTGCCTAAGGAGAAAATGGAAGAATTTGCCCAG 900 
I I I I I I I i i i i i i i i i i iii i i i i i i i i i i i i i i i i i i t i i i i i i i i iii 


Db 


Mill 1 II II II II II 1 III II 1 1 1 1 1 1 1 1 II 1 II II 1 II II 1 MM Ml 

849 GGACTCCACTGCAAACCTGCCAAACCCCTGCCTAAGG - - - AAATGGAAGACTTTGTACAG 905 


Qy 


901 AGCTCTGATGAAGACGGTGTT GTGTTTTCTCTGGAGTCAGCTGTGCAAAACCTTACA 957 

I I I i I I i iii i iiiiii i i i i i i i i i i i i i tiii ii iii i iii 


Db 


M 1 II 1 1 1 II II 1 1 II 1 M II II II II 1 II M II II M 1 1 1 II 

906 AGCTCTGGAGAAAATGGTGTTGTGGTGTTTTCTCTGGGGTCAATGGTCAGTAACATGACA 965 


Qy 


958 GAAGAAAAAGCTGATCTTATCACTTCGGCCCTGGCTCAGATTCCACAAAAAGTCATGAAG 1017 
1 1 1 1 1 1 1 II I I II i ii i i i i i i i i i i i i i i i i i i i i i ii ii i 


Db 


llllll!' M 1 Ml 1 II M 1 II 1 M II 1 1 1 II M II II M II 1 

966 GAAGAAAGGGCCAACGTAATTGCATCAGCCCTGGCCCAGATCCCACAAAAGGTTCTGTGG 1025 


Qy 


1018 TTCGGAAGGAAACCAAATACCTTAAGATCCAATACTCAGTGGCATAGGTGGATC 1071 


Db 


II 1 llllll MINIM 1 MIIMII 1 1 Ml MINI 

1026 AGATTTGATGGGAATAAACCAGATACCTTAGGTCTCAATACTCGGCTGTATAAGTGGATA 108 5 


Qy 


1072 CCACAGAATGAATGTCTTATCCTAGATCATCCCCAAACCAAAGCCTTTATAACTTATGGT 1131 


Db 


inoc M MMMM 1 MM llllll 1 INI MIIMII MM Mill 

1086 CCCCAGAATGA CCTTCTAGGTCATCCAAAGACCAGAGCTTTTATAACTCATGGT 1139 


Qy 


1132 GGAACAAATAGCATCTATGAGATGATCTACCGTGGAGTCCCTTCCATGGGCATTCCTTTG 1191 

Ml 1 IM MMMI Ml MMMI III Mill MM Mill Ml 

114 0 GG AG C C AATGG CAT CT ACG AGG CAAT CT ACC ATGGG AT CC CT ATGGTGGGGATTCC ATTG 1199 


Db 


Qy 


1192 TTTGCGGACCAACATGATAACATTGCTCACATGAAGGCCAAGGGAGCAGCTGTTATATTG 1251 

Mill II MM MIIIMIMMIIIIIIIIIMMI II M 1 M M 1 II II 1 II 

12 00 TTTGCCGATCAACCTGATAACATTGCTCACATGAAGGCCAGGGGAGCAGCTGTTAGAGTG 1259 


Db 


Qy 


1252 GACTTGAGCACAAAGTCAAGTACAGATTTGCTCGATATATCTGTGTTCGTATCTTTATTT 1311 

UNI 1 Mill III MIIMII Mill II II 1 III 1 1 

12 60 GACTTCAACACAATGTCGAGTACAGACTTGCTGAATGCATTGAAGAGAGTAATTAATGAT 1319 


Db 


Qy 


1312 TT ATC C TT C AG AT ATAAAGAG AGTGTT ATG AAATT ATC AAG AATT CAAC ATGATCAAC C A 13 71 

i7on II Ml Ml Mill MIMIMM II MMMI Mill llllll Mill II 

132 0 CCTTC ATATAAAGAGAATGTTATGAAATTATCAAGAATTCAACATGATCAACCA 13 73 


Db 


Qy 


1372 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTGAATTTGTCATGCGCCACAAAGGAGCC 1431 



MMMMMMMMMMMMMMMMMMMMIMMIMMIMMMM 

Db 1374 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTGAATTTGTCATGCGCCACAAAGGAGCT 1433 

Qy 1432 AAACACCTTCGAGTTGCAGCCCGTGACCTCACCTGGTTCCAGTACCACTCTTTGGATGTG 1491 

_ I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | M I I I I 
Db 1434 AAACACCTTCGGGTTGCAGCCCACGACCTCACCTGGTTCCAGTACCACTCTTTGGATGTG 1493 

Qy 14 92 ATTGGGTTTCTGCTGGCCTGTGTGGCAACTGTGACATTTATCATCACAAAGTGTTGTCTG 1551 

I IIIIIM HIIMI Ml MM MM II II II Mlllll Ml MM IMIIIMI 

Db 14 94 ATTGGGTTCCTGCTGGTCTGTGTGGCAACTGTGATATTTATCGTCACAAAATGTTGTCTG 1553 

Qy 1552 TTTTGTTTCTGGAAGTTTACTAGAAAAGTGAAGAAGGAAAAAAGGGATTAGTTAT 1606 

II llllll II I MM III II II Mill Mlllll Mill I II Ml MM 

Db 1554 TTTTGTTTCTGGAAGTTTGCTAGAAAAGCAAAGAAGGGAAAAAATGATTAGTTAT 1608 



RESULT 12
HSU59209
LOCUS       HSU59209    2107 bp    mRNA    linear   PRI 02-JUL-1998
DEFINITION  Homo sapiens C19steroid specific UDP-glucuronosyltransferase mRNA,
            complete cds.
ACCESSION   U59209
VERSION     U59209.1  GI:3287472
KEYWORDS    UGT2B17G.
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1  (bases 1 to 2107)
  AUTHORS   Beaulieu,M., Levesque,E., Hum,D.W. and Belanger,A.
  TITLE     Isolation and characterization of a novel cDNA encoding a human
            UDP-glucuronosyltransferase active on C19 steroids
  JOURNAL   J. Biol. Chem. 271 (37), 22855-22862 (1996)
  MEDLINE   96394358
   PUBMED   8798464
REFERENCE   2  (bases 1 to 2107)
  AUTHORS   Hum,D.W., Belanger,A., Beaulieu,M. and Levesque,E.
  TITLE     Direct Submission
  JOURNAL   Submitted (23-MAY-1996) Laboratory of Molecular Endocrinology,
            Centre Hospitalier de l'universite Laval, 2705 Boul. Laurier,
            Ste-Foy, Quebec G1V 4G2, Canada
FEATURES             Location/Qualifiers
     source          1..2107
                     /organism="Homo sapiens"
                     /mol_type="mRNA"
                     /db_xref="taxon:9606"
                     /cell_type="LNCaP"
                     /tissue_type="prostate"
     CDS             52..1644
                     /note="UGT2B17"
                     /codon_start=1
                     /product="C19steroid specific UDP-glucuronosyltransferase"
                     /protein_id="AAC25491.1"
                     /db_xref="GI:3287473"
                     /translation="MSLKWMSVFLLMQLSCYFSSGSCGKVLVWPTEYSHWINMKTILE
                     ELVQRGHEVIVLTSSASILVNASKSSAIKLEVYPTSLTKNDLEDFFMKMFDRWTYSIS
                     KNTFWSYFSQLQELCWEYSDYNIKLCEDAVLNKKLMRKLQESKFDVLLADAVNPCGEL
                     LAELLNIPFLYSLRFSVGYTVEKNGGGFLFPPSYVPVVMSELSDQMIFMERIKNMIYM
                     LYFDFWFQAYDLKKWDQFYSEVLGRPTTLFETMGKAEMWLIRTYWDFEFPRPFLPNVD
                     FVGGLHCKPAKPLPKEMEEFVQSSGENGIVVFSLGSMISNMSEESANMIASALAQIPQ
                     KVLWRFDGKKPNTLGSNTRLYKWLPQNDLLGHPKTKAFITHGGTNGIYEAIYHGIPMV
                     GIPLFADQHDNIAHMKAKGAALSVDIRTMSSRDLLNALKSVINDPIYKENIMKLSRIH
                     HDQPVKPLDRAVFWIEFVMRHKGAKHLRVAAHNLTWIQYHSLDVIAFLLACVATMIFM
                     ITKCCLFCFRKLAKTGKKKKRD"
ORIGIN

  Query Match 59.7%; Score 958.6; DB 9; Length 2107;
  Best Local Similarity 77.3%; Pred. No. 2.1e-209;
  Matches 1249; Conservative 0; Mismatches 339; Indels 27; Gaps
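
A quick consistency check on a record like this is to compare the CDS coordinates with the length of the /translation qualifier. The sketch below assumes the CDS range includes the stop codon, as is usual for an entry annotated "complete cds"; the function name `expected_protein_length` is chosen here for illustration.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Residues encoded by a CDS range that ends with a stop codon."""
    n_bases = cds_end - cds_start + 1
    if n_bases % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return n_bases // 3 - 1  # subtract the stop codon


# CDS 52..1644 of HSU59209 and CDS 15..1604 of HUMUDPGTA.
print(expected_protein_length(52, 1644))
print(expected_protein_length(15, 1604))
```

A translation whose residue count differs from the value computed this way points to a transcription problem in the sequence text rather than in the coordinates.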



Qy 



Db 



60 



1 ATGGCTATGAAATGGACTTCAGTCCTTCTGTTGATACAGCTGAGCTATTACTCTAGCTCT 

Ml M IMMMM MMM Mill MM Mill II I Mill Mlllll 

52 ATGTCTCTGAAATGGATGTCAGTCTTTCTGCTGATGCAGCTCAGTTGTTACTTTAGCTCT 111 



Qy 61 GGGAGTTGTGGAAATGTGCCGCTGTGGCCCATGGAATATAGTCCTTGGATGAATATAAAG 120 

IIMIMIIMIII 1 1 1 1 I Ml' II, Mill || | mill Mill III 

Db 112 GGGAGTTGTGGAAAGGTGCTGGTGTGGCCCACAGAATACAGCCATTGGATAAATATGAAG 171 

Qy 121 ACAATCCTGGATAAACTTATGCAGATAAGTCATGAGGTGACTGTTCTAACATTGTCAGCT 180 

MIIIIIIIM I Ml I MM II II III Mill III Mill II Ml 

Db 172 ACAATCCTGGAAGAGCTTGTTCAGAGGGGTCATGAGGTGATTGTGTTGACATCTTCGGCT 231 

Qy 181 TCCATTCTTGTTGATCCCAACATAACATCTGTTACTAAATTTGAGGTTTATTCTATATCT 240 

M MMMM II III I I MUM II 1 1 1 1 II II MUM Ml MM 

Db 232 TCTATTCTTGTCAATGCCAGTAAATCATCTGCTATTAAATTAGAAGTTTATCCTACATCT 291 

Qy 241 GTAATTAAAGATGATTTTGCAGGGTTTTTTTTCACACAACAGATTACTAAATGGATACAT 300 

Ml MM IIIMII I M llllll II MM II Mill I M 

Db 292 TT AACTAAAAATGATTTGGAAGATTTTTTTATGA AAATGTTCGATAGATGGACATAT 34 8 

Qy 301 GAT CTTC CAAAAC AT AT ATTTTGGTTT AAATG TGTT C C CTTC AAG AATATTC TTTGGG AA 360 

I M Mill III MMMM III I I I I I IMIMI 

Db 34 9 AGTATTTCAAAAAATACATTTTGGTCATATTTTTCACAACTACAAGAATTGTGTTGGGAA 4 08 

Qy 361 TATTCTGGTTATACTGAGAAGTTCTTTAAAGATGTAGTTTTGAACAAGAAACTTATGACA 420 

NIMH 1 1 1 1 I Ml III I llllll II IMI MINI lllll MM I I 

Db 409 TATTCTGACTATAATATAAAGCTCTGTGAAGATGCAGTTTTGAACAAGAAACTTATGAGA 468 

Qy 421 AACCTACAAGAATCAAGGTCTGATGTCGTTCATGCAAATGCCATTGGTCCCTTTGGAGAG 4 80 

M II Mil IMI II MIIIIII III Ml IIIMII MMMM III 

Db 4 69 AAACTACAAGAGTCAAAATTTGATGTCCTTCTGGCAGATGCCGTTAATCCCTGTGGTGAG 528 

Qy 4 81 CTGCTGGCTGAGCTATTAAAAATATCCTTTGTGTACAGTCTCCACTTCTCTCCTGGCTAC 540 

1 1 1 1 1 1 1 1 1 1 1 III I II III 1 1 II I Ml 1 1 III I III I III 1 1 1 IMIMI 

Db 529 CTGCTGGCTGAACTACTTAACATACCCTTTCTGTACAGTCTCCGCTTCTCTGTTGGCTAC 588 

Qy 541 ACATTTGAGAAATACAGTGGAGGATTTCTACTTCCACCTTCCTATGGAGCTGTTATTCTG 600 

III IMIMI I 1 1 MM MM Ml I II II II Ml II I I lllll M II 

Db 589 ACAGTTGAGAAGAATGGTGGAGGATTTCTGTTCCCTCCTTCCTATGTACCTGTTGTTATG 64 8 

Qy 601 TCAGAATTAAGTGGTTCGATGACATTCATGGAGACAGTAAGAAATATTATATATGTGTTT 660 

CAQ MM II 1 1 1 II 1 1 I MM II Mill Ml III lllll II II III II II 

Db 649 TCAGAATTAAGTGATCAAATGATTTTCATGGAGAGGATAAAAAATATGATATATATGCTT 708 

Qy 661 TATTTTGACTTTTGGTTCCAAACATTTGATATGAAGAAGGGAGACCAGTTTTACAGTGAA 72 0 

M NMM IMI III II III III MM MMMM I II Mill MM llllll 

Db 709 TATTTTGACTTTTGGTTTCAAGCATATGATCTGAAGAAGTGGGACCAGTTTTATAGTGAA 768 

Qy 7 21 GTTCTAGGTAAGTCATGTTTTTTATCTGAGATAATGGGAAAAGCTGAAATGTGGCTCATT 780 

MMMM I I I MM Mill Mill lllll I UNI, MM III 

Db 769 GTTCTAGGAAGACCCACTACATTATTTGAGACAATGGGGAAAGCTGAAATGTGGCTCATT 828 

Qy 781 CGAAACTACTGGTATTTGGAATTTCCTCGCCCACTCTTACCTAATTTTGAATTTGTTGTA 840 

llll Ml Ml MM M 1 1 1 1 II ! 1 1 ! II I IIIMII Ml IMI IMIMI I 

Db 829 CGAACCTATTGGGATTTTGAATTTCCTCGCCCATTCTTACCAAATGTTGATTTTGTTGGA 888 

Qy 84 1 AGACTCTACTGCAAACCTGTCAACCCCCTGCCTAAGGAGAAAATGGAAGAATTTGCCCAG 900 

UN llll Mill I III Ml IIIMII, MMMIIII llll III 

Db 889 GGACTTCACTGTAAACCAGCCAAACCCTTGCCTAAGG AAATGGAAGAGTTTGTGCAG 945 

Qy 901 AGCTCTGATGAAGACGG TGTTGTGTTTTCTCTGGAGTCAGCTGTGCAAAACCTTACA 957 

o i in ii i mi i ii iii mil mi mil mi i mi i ii 

Db 946 AGCTCTGGAGAAAATGGTATTGTGGTGTTTTCTCTGGGGTCGATGATCAGTAACATGTCA 1005 

Qy 95 8 GAAGAAAAAGCTGATCTTATCACTTCGGCCCTGGCTCAGATTCCACAAAAAGTCATGAAG 1017 

inA "Hill 1 1 I III I M Mill Ml Ml I MMMM M I I 

Db 1006 GAAGAAAGTGCCAACATGATTGCATCAGCCCTTGCCCAGATCCCACAAAAGGTTCTATGG 1065 
Qy 1018 TT CGG AAGGAAACC AAATACCTT AAG AT C C AAT ACT C AGTGG CAT AGGTGG ATC 1071 

IN Ml IMIIIII III I lllll Mill llll llll I 

Db 1066 AGATTTGATGGCAAGAAGCCAAATACTTTAGGTTCCAATACTCGACTGTATAAGTGGTTA 1125 

Qy 1072 CCACAGAATGAATGTCTTATCCTAGATCATCCCCAAACCAAAGCCTTTATAACTTATGGT 1131 

II IMIINI MM I IMIMI IMIIMMI MIIIMM lllll 



Db 

Qy 

Db 

Qy 
Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 

Qy 

Db 



1126 CCCCAGAATGACCTTCTT- 



- - GGTCATCCCAAAACCAAAGCTTTTATAACTCATGGT 1179 
1132 GG AACAAAT AG C AT CT ATG AG ATGATCT AC CG TGG AGT C C CTT CC ATGGG CATT C CTTTG 1191 

HIM Ml lllllllllll llllllll III Mill MINIUM Ml 

1180 GGAACCAATGGCATCTATGAGGCGATCTACCATGGGATCCCTATGGTGGGCATTCCCTTG 1239 

1192 TTTGCGGACCAACATGATAACATTGCTCACATGAAGGCCAAGGGAGCAGCTGTTATATTG 1251 

MIMIII I I I I I I I I t I I I I I I | I | I | | | | | | [ I I | | | | | | | | | M | | | M 
1240 TTTGCGGATCAACATGATAACATTGCTCACATGAAAGCCAAGGGAGCAGCCCTCAGTGTG 1299 

1252 GACTTGAGCACAAAGTCAAGTACAGATTTGCTCGATATATCTGTGTTCGTATCTTTATTT 1311 

IN I M II I IIMIMI MIIMIMI I II II M I M 

1300 GACATCAGGACCATGTCAAGTAGAGATTTGCTCAA TGCATTGAAGTCAGTCATT 1353 

1312 TTAT CCTTCAGATATAAAG AG AGTGTTATGAAATT AT C AAG AATT CAAC ATG AT C AAC CA 1371 

I I MIIMIMI I I 1 1 II I M 1 h 1 1 1 1 1 1 1 1 1 1 MIIMIIIII 

1354 AATGACCCTATCTATAAAGAGAATATCATGAAATTATCAAGAATTCATCATGATCAACCG 1413 
1372 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTG AATT TGTC ATG CGCCACAAAGGAGCC 1431 

1 1 1 1 1 1 1 1 M M 1 1 II I II II 1 1 M I M M 1 1 1 M I M 1 1 1 1 M M M I 1 1 1 1 1| 1 1 1 

1414 GTGAAG CCCCTGG AT CG AGC AG TCTT CTGG AT TG AGTTTGTCATGCG CC ATAAAGG AG C C 14 73 
1432 AAACACCTTCGAGTTGCAGCCCGTGACCTCACCTGGTTCCAGTACCACTCTTTGGATGTG 14 91 

II IIHIIII II M 1 1 M I lllllllllll I II II Mllll III Ml MINI 

1474 AAGCACCTTCGGGTCGCAGCCCACAACCTCACCTGGATCCAGTACCACTCTTTGGATGTG 1533 
1492 ATTGGGTTTCTGCTGGCCTGTGTGGCAACTGTGACATTTATCATCACAAAGTGTTGTCTG 1551 

_ M I II lllllllllll 1 1 1 1 1 1 1 1 1 III Mllll MIMIII Mill Ml 

1534 ATAGCATTCCTGCTGGCCTGCGTGGCAACTATGATATTTATGATCACAAAATGTTGCCTG 1593 
1552 TTTTGTTTCTGGAAGTTTACTAGAAAAGTGAAGAAGGAAAAAAGGGATTAGTTAT 1606 

MINIMI I Ml MINI M MINI I NIMMNNNIM 

1594 TTTTGTTTCCGAAAGCTTGCCAAAACAGGAAAGAAGAAGAAAAGGGATTAGTTAT 1648 



Result            Query
   No.    Score   Match  Length  DB  ID        Description

     1     1606   100.0    1606   6  ABN85391  Abn85391 Human NOV
     2             60.6    1855   6  ABL68501  Abl68501 Kidney ca
     3      973    60.6    1855   6  ABL68868  Abl68868 Kidney ca
     4      973    60.6    1855   6  ABN95622  Abn95622 Gene #212
     5      973    60.6    1855   9  ADD71099  Add71099 Human UDP
     6             60.6    1991   6  AAD45991  Aad45991 Human UGT
     7             60.4    1714   8  ADA11075  Ada11075 Human cDN
     8    966.6    60.2    1854   3  AAZ95200  Aaz95200 Human UDP
     9             59.9    1639   6  AAL41490  Aal41490 Drug meta
    10             59.8    1859   5  AAS69710  Aas69710 DNA encod
    11    958.6    59.7    2107   2  AAV15900  Aav15900 Uridine d
    12    958.6    59.7    3005   9  ADC39064  Adc39064 Novel hum
    13    950.6    59.2    1829   9  ADE53677  Ade53677 Human pro
    14    950.6    59.2    1976   3  AAZ95206  Aaz95206 Human UDP
    15    950.6    59.2    2090   6  ABK84210  Abk84210 Human cDN



RESULT 8
AAZ95200
ID   AAZ95200 standard; DNA; 1854 BP.
XX
AC   AAZ95200;
XX
DT   05-JUN-2000 (first entry)
XX
DE   Human UDP-glucuronosyltransferase 2B7 nucleotide sequence.
XX
KW   UDP-glucuronosyltransferase 2B7; UGT2B7; polymorphism; metabolism; SNPs;
KW   drug interaction; detect; human; single nucleotide polymorphism; ds.
XX
OS   Homo sapiens.
XX
PN   WO200006776-A1.
XX
PD   10-FEB-2000.
XX
PF   22-JUL-1999; 99WO-US016675.
XX
PR   28-JUL-1998; 98US-0094391P.
XX
PA   (AXYS-) AXYS PHARM INC.
XX
PI   Galvin M, Miller A, Penny L, Riedy M;
XX
DR   WPI; 2000-195321/17.
DR   P-PSDB; AAY78934.
XX
PT   Novel human UDP-glucuronosyltransferase sequence, polymorphisms for
PT   genotyping individuals to predict rate of metabolism of substrates and
PT   for identifying potential drug interactions.
XX
PS   Disclosure; Page 41-44; 72pp; English.
XX
CC   This sequence represents the human UDP-glucuronosyltransferase 2B7
CC   (UGT2B7) gene. UDP-glucuronosyltransferase (UGTs) are a family of enzymes
CC   that catalyse the glucuronic acid conjugation of a wide range of
CC   endogenous and exogenous substrates. The UGT2B gene subfamily encode
CC   steroid metabolizing isoforms in the liver. Alteration of the expression
CC   or function of UGTs may effect drug metabolism. The invention relates to
CC   non-chromosomal nucleic acid molecules, which comprise human UGT2B
CC   sequence polymorphisms (see AAZ95051-Z95110). Probes which detect the
CC   UGT2B locus polymorphisms can be used to detect altered UGT2B metabolism
CC   of a substrate in an individual. The nucleic acid molecules comprising a
CC   human UGT2B sequence polymorphism can be used in screening assays for
CC   genotyping individuals, also to predict their rate of metabolism of UGT2B
CC   substrate, potential drug-drug interactions and adverse side effects. The
CC   polymorphisms can be used as single nucleotide polymorphisms (SNPs) for
CC   detecting genetic linkage related to phenotypic variation in activity or
CC   expression of UGT2B protein. The polymorphism containing nucleic acid
CC   molecules may also be used for generating genetically modified non-human
CC   animals and for obtaining site specific gene modification in cell lines
SQ   Sequence 1854 BP; 572 A; 338 C; 392 G; 552 T; 0 U; 0 Other;

  Query Match 60.2%; Score 966.6; DB 3; Length 1854;
  Best Local Similarity 78.1%; Pred. No. 4.2e-241;
  Matches 1261; Conservative 0; Mismatches 324; Indels 30; Gaps 7;
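
The SQ line of the AAZ95200 record carries its own checksum of sorts: the per-base counts should add up to the stated sequence length. The sketch below parses such a line and performs that check; the function name `parse_sq_line` and the regular expressions are illustrative assumptions about the line layout seen in this report, not a general parser for the database format.

```python
import re


def parse_sq_line(line: str) -> tuple[int, dict[str, int]]:
    """Return (stated length, per-symbol counts) from a nucleotide SQ line."""
    head = re.match(r"SQ\s+Sequence\s+(\d+)\s+BP;(.*)", line)
    if head is None:
        raise ValueError("not a nucleotide SQ line")
    length = int(head.group(1))
    counts = {sym: int(n)
              for n, sym in re.findall(r"(\d+)\s+([A-Za-z]+);", head.group(2))}
    return length, counts


SQ = "SQ Sequence 1854 BP; 572 A; 338 C; 392 G; 552 T; 0 U; 0 Other;"
length, counts = parse_sq_line(SQ)
print(length, counts, sum(counts.values()) == length)
```

For this record the four base counts sum to the stated 1854 bp, so the composition line is internally consistent.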

Qy 1 ATGGCTATGAAATGGACTTCAGTCCTTCTGTTGATACAGCTGAGCTATTACTCTAGCTCT 60 

in ii nun mi inn i 11 n i inn mini n n mini 

Db !5 ATGT CTGTGAAATGGACTT C AGT AATTTTGCTAAT ACAACTG AG CTTTTGCTTT AG CT CT 74 



120 



Db 

Qy 
Db 



Qy 61 GGG AGTTG TGG AAATGTG C CG CTGTGGCC CATGG AATAT AGT CCTTGG ATGAAT AT AAAG 

mi mum iiii i inn i inn n i mmnmimi 

75 GGG AATTGTGG AAAGGTG CTGG TGTGGGC AG CAGAAT AC AG C CATTGG ATGAAT AT AAAG 134 
121 ACAATCCTGGATAAACTTATGCAGATAAGTCATGAGGTGACTGTTCTAACATTGTCAGCT 18 0 

minium i inn mi i nniininmii n m mm 

135 ACAATCCTGGATGAGCTTATTCAGAGAGGTCATGAGGTGACTGTACTGGCATCTTCAGCT 194 
^ TCCATTCTTGTTGATCCCAACATAACATCTGTTACTAAATTTGAGGTTTATTCTATATCT 24 0 

mi ii ii i ii mil iiii i 1 1 1 1 i i mi mi imii i i 

Db !95 TCCATTCTTTTTGATCCCAACAACTCATCCGCTCTTAAAATTGAAATTTATCCCACATCT 254 

Qy 241 GTAATTAAAGATGATTTTGCAGGGTTTTTTTTCACACAACAGATTACTAAATGGATACAT 3 00 

mini in ii ii in 1 1 1 ii n mi n iim i 

Db 255 TTAACTAAAACTGAGTT GGAGAATTTCATCATGCAACAGATTAAGAGATGGTCA 308 

Qy 301 GATCTTCCAAAACATATATTTTGGTTTAAATGTGTTCCCTTCAAGAATATTCTTTGGGAA 360 

n n iii mi iii ii ii inn i i i i r n i . i 

Db 309 GACCTTCCAAAAGATACATTTTGGTTATATTTTTCACAAGTACAGGAAATCATGTCAATA 368 



3 SI TATTCTGGTTATACTGAGAAGTTCTTTAAAGATGTAGTTTTGAACAAGAAACTTATGACA 420 
, co 1 1 11 111 I II M II I III III I II I I || || | I | | | | | | | | | | | 

Db 369 TTTGGTGACATAACTAGAAAGTTCTGTAAAGATGTAGTTTCAAATAAGAAATTTATGAAA 428

Qy 421 AACCTACAAGAATCAAGGTCTGATGTCGTTCATGCAAATGCCATTGGTCCCTTTGGAGAG 480

,„ 11 in i in inn i iii iii ii mi mi iii inn i i in 

Db 429 AAAGTACAAGAGTCAAGATTTGACGTCATTTTTGCAGATGCTATTTTTCCCTGTAGTGAG 488

Qy 481 CTGCTGGCTGAGCTATTAAAAATATCCTTTGTGTACAGTCTCCACTTCTCTCCTGGCTAC 540

iiiiiiiiiiiiiini ii in 1 1 1 1 f i j j 1 1 1 1 1 r i f i iiimmmmi 

Db 489 CTGCTGGCTGAGCTATTTAACATACCCTTTGTGTACAGTCTCAGCTTCTCTCCTGGCTAC 548

Qy 541 ACATTTGAGAAATACAGTGGAGGATTTCTACTTCCACCTTCCTATGGAGCTGTTATTCTG 600

ii inn ii i illinium i i n niiini i i mmi n n 

Db 549 ACTTTTGAAAAGCATAGTGGAGGATTTATTTTCCCTCCTTCCTACGTACCTGTTGTTATG 608

Qy 601 TCAGAATTAAGTGGTTCGATGACATTCATGGAGACAGTAAGAAATATTATATATGTGTTT 660

m 1 1 1 1 1 1 1 1 n i nm minim nn mm n mm n 

Db 609 TCAGAATTAACTGATCSAATGACTTTCATGGAGAGGGTAAAAAATATGATCTATGTGCTT 668

Qy 661 TATTTTGACTTTTGGTTCCAAACATTTGATATGAAGAAGGGAGACCAGTTTTACAGTGAA 720

cco n i mm mi nn in mm minm i n mimiii nun 

Db 669 TACTTTGACTTTTGGTTCGAAATATTTGACATGAAGAAGTGGGATCAGTTTTATAGTGAA 728

Qy 721 GTTCTAGGTAAGTCATGTTTTTTATCTGAGATAATGGGAAAAGCTGAAATGTGGCTCATT 780

-..Minm I I I I I I I I I I I | | | | | | | | | | | | j | | | Mllll III 

Db 729 GTTCTAGGAAGACCCACTACATTATCTGAGACAATGGGGAAAGCTGACGTATGGCTTATT 788

Qy 781 CGAAACTACTGGTATTTGGAATTTCCTCGCCCACTCTTACCTAATTTTGAATTTGTTGTA 840

Minn nn nn i nm minimi m nn mini i 

Db 789 CGAAACTCCTGGAATTTTCAGTTTCCATATCCACTCTTACCAAATGTTCATTTTGTTC 848

Qy 841 AGACTCTACTGCAAACCTGTCAACCCCCTGCCTAAGGAGAAAATGGAAGAATTTGCCCAG 900

Mill III II I III I I I II I I III I I I II III | f | | M I I I I I I I I Ml 
Db 849 GGACTCCACTGCAAACCTGCCAAACCCCTGCCTAAGG AAATGGAAGACTTTGTACAG 905

Qy 901 AGCTCTGATGAAGACGGTGTT---GTGTTTTCTCTGGAGTCAGCTGTGCAAAACCTTACA 957

Mum iii i inni niinimm mm m mi i mi 

Db 906 AGCTCTGGAGAAAATGGTGTTGTGGTGTTTTCTCTGGGGTCAATGGTCAGTAACATGACA 965

Qy 958 GAAGAAAAAGCTGATCTTATCACTTCGGCCCTGGCTCAGATTCCACAAAAAGTCATGAAG 1017

MMMI II I I II Ml IIMIMI Mill 1 1 1 1 1 1 1 1 II II I 

Db 966 GAAGAAAGGGCCAACGTAATTGCATCAGCCCTGGCCCAGATCCCACAAAAGGTTCTGTGG 1025



Qy 1018 ------TTCGGAAGGAAACCAAATACCTTAAGATCCAATACTCAGTGGCATAGGTGGATC 1071

Db 1026 AGATTTGATGGGAATAAACCAGATACCTTAGGTCTCAATACTCGGCTCTACAAGTGGAT^ 1085

Qy 1072 CCACAGAATGAATGTCTTATCCTAGATCATCCCCAAACCAAAGCCTTTATAACTTATGGT 1131

M I I I I I I I I | M III II I I I II I Ml I I I | I I I | I I I I I I 

Db 1086 CCCCAGAATGA CCTTCTAGGTCATCCAAAGACCAGAGCTTTTATAACTCATC 1139

Qy 1132 GGAACAAATAGCATCTATGAGATGATCTACCGTGGAGTCCCTTCCATGGGCATTCCTTTG 1191

„,„ j.M i mi mini m iimn iii nm mm mi in 

Db 1140 ggagccaatggcatctacgacmcaatctaccatgggatccctatggtggggattcca™ 1199

Qy 1192 TTTGCGGACCAACATGATAACATTGCTCACATGAAGGCCAAGGGAGCAGCTGTTATATTG 1251

... sa^^ l!59 

Db 1260 GACTTCAACACAATGTCGAGTACAGACTTGCTGAATGCATTGAAGAGAGTAATTAATGAT 1319

Qy 1312 TTATCCTTCAGATATAAAGAGAGTGTTATGAAATTATCAAGAATTCAACATGATCAACCA 1371
Db 1320 Cc4c------Si^^^ 1373



Qy 1372 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTGAATTTGTCATGCGCCACAAAGGAGCC 1431

' 1 11 n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1374 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTGAAT^ 1433



Qy 1432 AAACACCTTCGAGTTGCAGCCCGTGACCTCACCTGGTTCCAGTACCACTCTTTGGATGTG 1491 

_ IIIIIIIIMI HMIIIIII IIIIIIIIIIIIIIIIIIIIIIIIIMMIIIIIII 

Db 1434 AAACACCTTCGGGTTGCAGCCCACGACCTCACCTGGTTCCAGTACCACTCTTTGGATGTG 1493 

Qy 1492 ATTGGGTTTCTGCTGGCCTGTGTGGCAACTGTGACATTTATCATCACAAAGTGTTGTCTG 1551

muni wiw II Will II I llllll 1 1 1 1 1 1 1 Ml M II MM || Ml 

Db 1494 ATTGGGTTCCTGCTGGTCTGTGTGGCAACTGTGATATTTATCGTCACAAAATGTTGTCTG 1553

Qy 1552 TTTTGTTTCTGGAAGTTTACTAGAAAAGTGAAGAAGGAAAAAAGGGATTAGTTAT 1606 

w w II Mil llll 1 1 II II II III II II Mill III II II IN II I 

Db 1554 TTTTGTTTCTGGAAGTTTGCTAGAAAAGCAAAGAAGGGAAAAAATGATTAGTTAT 1608 
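
The summary figures reported for this hit can be cross-checked with simple arithmetic. The sketch below assumes that "Best Local Similarity" is the number of matches divided by the number of aligned columns (matches + mismatches + indels). The report does not state that definition; it is inferred from the numbers, and the function names are hypothetical.

```python
def best_local_similarity(matches: int, mismatches: int, indels: int) -> float:
    """Percent of aligned columns that are exact matches (assumed definition)."""
    aligned_columns = matches + mismatches + indels
    return 100.0 * matches / aligned_columns


def composition_total(a: int, c: int, g: int, t: int, u: int = 0, other: int = 0) -> int:
    """Sum of the per-base counts on an SQ line; it should equal the stated length."""
    return a + c + g + t + u + other


# Figures copied from the statistics block and SQ line of the hit above.
similarity = best_local_similarity(matches=1261, mismatches=324, indels=30)
print(f"Best Local Similarity: {similarity:.1f}%")  # 78.1%
print(composition_total(a=572, c=338, g=392, t=552))  # 1854
```

Under the same assumption, the figures reported for RESULT 11 (1249 matches, 339 mismatches, 27 indels) give 77.3%, which agrees with the value printed in that record.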



RESULT 11 
AAV15900 

ID AAV15900 standard; cDNA ; 2107 BP. 
XX 

AC AAV15900; 
XX 

DT 26-MAY-1998 (first entry) 
XX 

DE Uridine diphospho-glucuronosyltransferase 2B17 (UGT2B17) encoding cDNA.

KW Uridine diphospho-glucuronosyltransferase 2B17; UGT2B17; catalyse;

KW androsterone; androsterone-glucuronic acid; androgen; enzyme; ss.
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers

FT 5'UTR 1..51

FT /*tag= a

FT CDS 52..1644

FT /*tag= b

FT /product= "UGTB17 enzyme"

FT 3'UTR 1645..2107

FT /*tag= c 

XX 

PN WO9744466-A1.
XX 

PD 27-NOV-1997. 
XX 

PF 16-MAY-1997; 97WO-CA000328.
XX 

PR 17-MAY-1996; 96US-00649319.
XX 

PA (ENDO-) ENDORECHERCHE INC. 
XX 

PI Belanger A, Hum DW, Beaulieu M, Levesque E; 

DR WPI; 1998-018520/02. 

DR P-PSDB; AAW47126. 
XX 

PT DNA encoding uridine di:phospho:glucuronosyl:transferase 2B17 - which

PT catalyses conversion of androsterone to androsterone-glucuronic acid. 

XX
PS Claim 15; Page 4-6; 53pp; English.
XX
CC [...] an enzyme uridine di-phosphoglucuronosyltransferase
CC 2B17 (UGT2B17). This novel enzyme catalyses the conversion of
CC androsterone to androsterone-glucuronic acid. The UGT2B17 can be used to
CC detect anti-UGT2B17 antibodies. The antibody can be used to detect a
CC [...] concentration of UGT2B17 or an alteration in androgen activity.
CC The UGT2B17 can also be used to alter the concentration of an androgenic
CC compound in a tissue, specifically dihydrotestosterone. An isolated
CC nucleotide sequence comprising at least 30 consecutive nucleotides from
CC [...] of the 2107 base pair sequence, or its complement can
CC [...] the synthesis of UGT2B17, e.g. an expression disrupting
CC sense or antisense fragment, or as a probe for a UGT2B17 coding sequence
XX 

SQ Sequence 2107 BP; 657 A; 382 C; 424 G; 644 T; 0 U; 0 Other; 

Query Match 59.7%; Score 958.6; DB 2; Length 2107; 

Best Local Similarity 77.3%; Pred. No. 5.4e-239; 

Matches 1249; Conservative 0; Mismatches 339; Indels 27; Gaps 



Qy  1 ATGGCTATGAAATGGACTTCAGTCCTTCTGTTGATACAGCTGAGCTATTACTCTAGCTCT 60

      ||| || |||||||||  |||||| ||||| |||| ||||| || | ||||| |||||||

Db 52 ATGTCTCTGAAATGGATGTCAGTCTTTCTGCTGATGCAGCTCAGTTGTTACTTTAGCTCT 111

Qy 61 GGGAGTTGTGGAAATGTGCCGCTGTGGCCCATGGAATATAGTCCTTGGATGAATATAAAG 120 

Ml NIMH Mil Ml I III HUM III II II I 1 1 III I 1 1 II I III 

Db 112 GGGAGTTGTGGAAAGGTGCTGGTGTGGCCCACAGAATACAGCCATTGGATAAATATGAAG 171 

Qy 121 ACAATCCTGGATAAACTTATGCAGATAAGTCATGAGGTGACTGTTCTAACATTGTCAGCT 180

mi mi ii i i iii i mi n ii mi mi m mm n m 

Db 172 ACAATCCTGGAAGAGCTTGTTCAGAGGGGTCATGAGGTGATTGTGTTGACATCTTCGGCT 231 

Qy 181 TCCATTCTTGTTGATCCCAACATAACATCTGTTACTAAATTTGAGGTTTATTCTATATCT 240

m m inn ii in i i i m 1 1 n 1 1 m i n i m 1 1 in m i 

Db 232 TCTATTCTTGTCAATGCCAGTAAATCATCTGCTATTAAATTAGAAGTTTATCCTACATCT 291 

Qy 241 GTAATTAAAGATGATTTTGCAGGGTTTTTTTTCACACAACAGATTACTAAATGGATACAT 300 

OQ mi mi iiinii i ii mm m ii i i ii mil i n 

Db 292 TTAACTAAAAATGATTTGGAAGATTTTTTTATGA AAATGTTCGATAGATGGACATAT 348

Qy 301 GATCTTCCAAAACATATATTTTGGTTTAAATGTGTTCCCTTCAAGAATATTCTTTGGGAA 360

mi inn ii mini iii i m i i i i i mini 

Db 349 AGTATTTCAAAAAATACATTTTGGTCATATTTTTCACAACTACAAGAATTGTGTTGGGAA 408



Qy 361 TATTCTGGTTATACTGAGAAGTTCTTTAAAGATGTAGTTTTGAACAAGAAACTTATGACA 420 

^ I Mill IMI I III Ml I I ! 1 1 [ I 1 1 M II 1 1 1 1 1 1 1 1 1 M I M M I I 

Db 409 TATTCTGACTATAATATAAAGCTCTGTGAAGATGCAGTTTTGAACAAGAAACTTATGAGA 468

Qy 421 AACCTACAAGAATCAAGGTCTGATGTCGTTCATGCAAATGCCATTGGTCCCTTTGGAGAG 480

m m m ii 1 1 1 1 i i m m i ii i in inn ii inn m m 

Db 469 AAACTACAAGAGTCAAAATTTGATGTCCTTCTGGCAGATGCCGTTAATCCCTGTGGTGAG 528

Qy 481 CTGCTGGCTGAGCTATTAAAAATATCCTTTGTGTACAGTCTCCACTTCTCTCCTGGCTAC 540 

COQ Ml I III III I III I II Ml 1 1 Ml Ml M 1 1 II I II 1 1 M III III I III 

Db 529 CTGCTGGCTGAACTACTTAACATACCCTTTCTGTACAGTCTCCGCTTCTCTGTTGGCTAC 588 

Qy 541 ACATTTGAGAAATACAGTGGAGGATTTCTACTTCCACCTTCCTATGGAGCTGTTATTCTG 600

m mm in i mi i ii in in Mmimmni inn n n 

Db 589 ACAGTTGAGAAGAATGGTGGAGGATTTCTGTTCCCTCCTTCCTATGTACCTGTTGTTATG 648 

Qy 601 TCAGAATTAAGTGGTTCGATGACATTCATGGAGACAGTAAGAAATATTATATATGTGTTT 660 

c i in ii mi ii m ii 1 1 mn mn in www www 11 11 

Db 649 TCAGAATTAAGTGATCAAATGATTTTCATGGAGAGGATAAAAAATATGATATATATGCTT 708

Qy 661 TATTTTGACTTTTGGTTCCAAACATTTGATATGAAGAAGGGAGACCAGTTTTACAGTGAA 720

       ||||||||||||||||| |||| ||| |||| |||||||| | ||||||||||| ||||||

Db 709 TATTTTGACTTTTGGTTTCAAGCATATGATCTGAAGAAGTGGGACCAGTTTTATAGTGAA 768



Qy 721 GTTCTAGGTAAGTCATGTTTTTTATCTGAGATAATGGGAAAAGCTGAAATGTGGCTCATT 780 

m in in i i i 1 1 r i 1 1 1 1 1 1 1 r 1 1 1 1 1 1 r i f 1 1 1 r r 1 1 1 1 1 1 1 1 1 r 

Db 769 GTTCTAGGAAGACCCACTACATTATTTGAGACAATGGGGAAAGCTGAAATGTGGCT 828 

Qy 781 CGAAACTACTGGTATTTGGAATTTCCTCGCCCACTCTTACCTAATTTTGAATTTGTTGTA 840

„ o jj." 111 "I Ml MM MM Ml Ml I II II 1 1 III II 1 1 I III II I I 

Db 829 CGAACCTATTGGGATTTTGAATTTCCTCGCCCATTCTTACCAAATGTTGATTTTGTTGGA 888 

Qy 841 AGACTCTACTGCAAACCTGTCAACCCCCTGCCTAAGGAGAAAATGGAAGAATTTGCCCAG 900 

a MM MM MM I III III I llllll II III Ml IMI Mil Ml 

Db 889 GGACTTCACTGTAAACCAGCCAAACCCTTGCCTAAGG AAATGGAAGAGTTTGTGCAG 945

Qy 901 AGCTCTGATGAAGACGG TGTTGTGTTTTCTCTGGAGTCAGCTGTGCAAAACCTTACA 957 

mm mi mi ii in minimi ii in i mi n 

Db 946 AGCTCTGGAGAAAATGGTATTGTGGTGTTTTCTCTGGGGTCGATGATCAGTAACATGTCA 1005



Qy 958 GAAGAAAAAGCTGATCTTATCACTTCGGCCCTGGCTCAGATTCCACAAAAAGTCATGAAG 1017

IIIIHI II I Ml I II Mill II Mill llllllll || I l 

Db 1006 GAAGAAAGTGCCAACATGATTGCATCAGCCCTTGCCCAGATCCCACAAAAGGTTCTATGG 1065

Qy 1018 TTCGGAAGGAAACCAAATACCTTAAGATCCAATACTCAGTGGCATAGGTGGATC 1071

M I HI llllllll III | MINIMI I III MM I 

Db 1066 AGATTTGATGGCAAGAAGCCAAATACTTTAGGTTCCAATACTCGACTGTATAAGTGGTTA 1125

Qy 1072 CCACAGAATGAATGTCTTATCCTAGATCATCCCCAAACCAAAGCCTTTATAACTTATGGT 1131

_ II IMMI MM I 1 1 1 M 1 1 IMIIIMM MIIMM MM 

Db 1126 CCCCAGAATGACCTTCTT GGTCATCCCAAAACCAAAGCTTTTATAACTCATGGT 1179

Qy 1132 GGAACAAATAGCATCTATGAGATGATCTACCGTGGAGTCCCTTCCATGGGCATTCCTTTG 1191

110 mil III 1 1 1 1 1 1 1 1 1 1 1 MMMM III MM MMMMM III 

Db 1180 GGAACCAATGGCATCTATGAGGCGATCTACCATGGGATCCCTATGGTGGGCATTCCCTTG 1239

Qy 1192 TTTGCGGACCAACATGATAACATTGCTCACATGAAGGCCAAGGGAGCAGCTGTTATATTG 1251

millM 1 1 M I M I I M I II I I I I M I I M I M I M I II I I I I | | | || 
Db 1240 TTTGCGGATCAACATGATAACATTGCTCACATGAAAGCCAAGGGAGCAGCCCTCAGTGTG 1299

Qy 1252 GACTTGAGCACAAAGTCAAGTACAGATTTGCTCGATATATCTGTGTTCGTATCTTTATTT 1311

nnn Ml I MM I I Ml MMMMM I II 1 1 || I II 

Db 1300 GACATCAGGACCATGTCAAGTAGAGATTTGCTCAA TG CATTG AAGT CAGT CAT T 1353

Qy 1312 TTATCCTTCAGATATAAAGAGAGTGTTATGAAATTATCAAGAATTCAACATGATCAACCA 1371

1 I MMMMM I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 1354 AATGACCCTATCTATAAAGAGAATATCATGAAATTATCAAGAATTCATCATGATCAACCG 1413

Qy 1372 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTGAATTTGTCATGCGCCACAAAGGAGCC 1431

' I II 1 1 1 [ 1 1 1 1 M 1 1 1 f 1 1 ) 1 1 M 1 1 1 M 1 1 1 1 1 I! 1 1 1 1 1 1 1 1 1 1 M MMMII 

Db 1414 GTGAAGCCCCTGGATCGAGCAGTCTTCTGGATTGAGTTTGTCATGCGCCATAAAGGAGCC 1473

Qy 1432 AAACACCTTCGAGTTGCAGCCCGTGACCTCACCTGGTTCCAGTACCACTCTTTGGATGTG 1491

H MMMII II IMMI MMMMM 1 1 1 II II II II I III III II II I 

Db 1474 AAGCACCTTCGGGTCGCAGCCCACAACCTCACCTG^ 1533

Qy 1492 ATTGGGTTTCTGCTGGCCTGTGTGGCAACTGTGACATTTATCATCACAAAGTGTTGTCTG 1551

M I M MMMMM MMMM Ml IMMI MMMII MM Ml 

Db 1534 ATAGCATTCCTGCTGGCCTGCGTGGCAACTATGATATTTATGATCACAAAATGTTC 1593

Qy 1552 TTTTGTTTCTGGAAGTTTACTAGAAAAGTGAAGAAGGAAAAAAGGGATTAGTTAT 1606

MMMII I Ml II I Ml II MM I 1 1 1 1 1 1 1 II 1 1 1 M 1 1 

Db 1594 TTTTGTTTCCGAAAGCTTGCCAAAACAGGAAAGAAGAAGAAAAGG^ 1648
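
The feature table of AAV15900 can be checked against the stated sequence length in the same way. This is a minimal sketch, assuming the three FT ranges are 1-based and inclusive and that the 5'UTR starts at position 1 (the start coordinate is hard to read in the source). The names `FEATURES` and `span_length` are hypothetical.

```python
# Feature coordinates as read from the FT lines of AAV15900 (1-based, inclusive).
# Assumption: the 5'UTR start, which is poorly legible in the source, is position 1.
FEATURES = {"5'UTR": (1, 51), "CDS": (52, 1644), "3'UTR": (1645, 2107)}


def span_length(start: int, end: int) -> int:
    """Length of an inclusive coordinate range."""
    return end - start + 1


lengths = {name: span_length(start, end) for name, (start, end) in FEATURES.items()}
total_length = sum(lengths.values())
cds_codons, remainder = divmod(lengths["CDS"], 3)

print(lengths["CDS"])         # 1593
print(total_length)           # 2107
print(cds_codons, remainder)  # 531 0
```

The three features tile the 2107 BP given on the ID and SQ lines, and the coding region divides evenly into 531 codons. If the CDS range includes the stop codon, which the record does not say, that corresponds to a 530-residue product.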
Description

Sequence 39, Appl
Sequence 1, Appli
Sequence 112, App
Sequence 1, Appli
Sequence 241, App
Sequence 1, Appli
Sequence 114, App
Sequence 41, Appl
Sequence 412, App
Sequence 1, Appli
Sequence 403, App



Result No.  Score  Query Match  Length  DB  ID        Description

1           798.4  49.7         2079    14  CD013998  CD013998 90117389
2           784.2  48.8         1946    11  AK050435  AK050435 Mus muscu
3           777.8  48.4         2573    11  AK004971  AK004971 Mus muscu
AK034801
AK002736 
AK083294 
CD013996 
CD013997 
BF689099



Mus muscu 
Mus muscu 
Mus muscu 
90117309 
90117357 
Mus muscu 
Mus muscu 
BX444042 
90130114 
90130122 
Mus muscu 